Sense Embeddings in Knowledge-Based Word Sense Disambiguation
Abstract
In this paper, we develop a new way of creating sense vectors for any dictionary: using an existing word-embeddings model, we sum the vectors of the terms in a sense's definition, weighted according to their part of speech and their frequency. These vectors are then used to find the senses closest to any given sense, automatically generating a semantic network of related concepts. This network is evaluated against the existing semantic network in WordNet by comparing their contributions to a knowledge-based Word Sense Disambiguation (WSD) method. The approach can be applied to any language that lacks such a semantic network, since the creation of word vectors is fully unsupervised and the creation of sense vectors requires only a traditional dictionary. The results show that our generated semantic network greatly improves the WSD system, almost as much as the manually created one.
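As a rough illustration of the construction described in the abstract, the sketch below builds a sense vector as a weighted sum of the word vectors of its definition terms and then links each sense to its nearest neighbours by cosine similarity. The specific POS weights, the frequency damping, and all names (word_vectors, word_frequency, sense_vector, nearest_senses) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical inputs (not from the paper): a word -> vector map from any
# pretrained embedding model, and a word -> corpus-frequency map.
word_vectors = {}    # e.g. {"cat": np.array([...]), ...}
word_frequency = {}  # e.g. {"cat": 120345, ...}

# Assumed POS weights: content words contribute more than other tokens.
POS_WEIGHT = {"NOUN": 1.0, "VERB": 0.8, "ADJ": 0.6, "ADV": 0.4}

def sense_vector(definition_tokens):
    """Build a sense vector from the (word, pos) pairs of a gloss."""
    dim = len(next(iter(word_vectors.values())))
    total, weight_sum = np.zeros(dim), 0.0
    for word, pos in definition_tokens:
        if word not in word_vectors:
            continue
        # Assumed frequency damping: down-weight very common words.
        freq_weight = 1.0 / (1.0 + np.log1p(word_frequency.get(word, 0)))
        w = POS_WEIGHT.get(pos, 0.2) * freq_weight
        total += w * word_vectors[word]
        weight_sum += w
    return total / weight_sum if weight_sum else total

def nearest_senses(target_vec, sense_vectors, k=5):
    """Rank other senses by cosine similarity to build the semantic network."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    scores = [(sid, cos(target_vec, vec)) for sid, vec in sense_vectors.items()]
    return sorted(scores, key=lambda x: x[1], reverse=True)[:k]
```

Linking each sense to its top-k neighbours produced this way yields the automatically generated semantic network that the abstract compares against WordNet's manually curated relations.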
Similar papers
Semi-Supervised Word Sense Disambiguation Using Word Embeddings in General and Specific Domains
One of the weaknesses of current supervised word sense disambiguation (WSD) systems is that they only treat a word as a discrete entity. However, a continuous-space representation of words (word embeddings) can provide valuable information and thus improve generalization accuracy. Since word embeddings are typically obtained from unlabeled data using unsupervised methods, this method can be see...
Distributional Lesk: Effective Knowledge-Based Word Sense Disambiguation
We propose a simple, yet effective, Word Sense Disambiguation method that uses a combination of a lexical knowledge-base and embeddings. Similar to the classic Lesk algorithm, it exploits the idea that overlap between the context of a word and the definition of its senses provides information on its meaning. Instead of counting the number of words that overlap, we use embeddings to compute the ...
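The snippet below is a minimal sketch of the embedding-based Lesk idea summarized in this abstract: each candidate sense is scored by the cosine similarity between an averaged context vector and an averaged gloss vector. The function names and the plain averaging scheme are assumptions for illustration, not the cited paper's implementation.

```python
import numpy as np

def avg_vector(tokens, word_vectors):
    """Average the available word vectors of a token list; None if none found."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else None

def disambiguate(context_tokens, candidate_senses, word_vectors):
    """candidate_senses: dict mapping sense id -> list of gloss tokens."""
    ctx = avg_vector(context_tokens, word_vectors)
    best, best_score = None, float("-inf")
    for sense_id, gloss_tokens in candidate_senses.items():
        gloss = avg_vector(gloss_tokens, word_vectors)
        if ctx is None or gloss is None:
            continue
        # Cosine similarity replaces the word-overlap count of classic Lesk.
        score = float(np.dot(ctx, gloss) /
                      (np.linalg.norm(ctx) * np.linalg.norm(gloss) + 1e-12))
        if score > best_score:
            best, best_score = sense_id, score
    return best
```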
Integrating WordNet for Multiple Sense Embeddings in Vector Semantics
Popular distributional approaches to semantics allow for only a single embedding of any particular word. A single embedding per word conflates the distinct meanings of the word and their appropriate contexts, irrespective of whether those usages are related or completely disjoint. We compare models that use the graph structure of the knowledge base WordNet as a post-processing step to improve v...
Biomedical Word Sense Disambiguation with Neural Word and Concept Embeddings
Addressing ambiguity issues is an important step in natural language processing (NLP) pipelines designed for information extraction and knowledge discovery. This problem is also common in biomedicine where NLP applications have become indispensable to exploit latent information from biomedical literature and ...
Detecting Most Frequent Sense using Word Embeddings and BabelNet
Since the inception of the SENSEVAL evaluation exercises there has been a great deal of research into Word Sense Disambiguation (WSD). Over the years, various supervised, unsupervised and knowledge-based WSD systems have been proposed. Beating the first-sense heuristic is a challenging task for these systems. In this paper, we present our work on Most Frequent Sense (MFS) detection usin...
Publication date: 2017